Improved Hidden Markov Modeling for Speaker-Independent Continuous Speech Recognition

Authors

  • Xuedong Huang
  • Fil Alleva
  • Satoru Hayamizu
  • Hsiao-Wuen Hon
  • Mei-Yuh Hwang
  • Kai-Fu Lee
Abstract

This paper reports recent efforts to further improve the performance of the Sphinx system for speaker-independent continuous speech recognition. The recognition error rate is significantly reduced by incorporating additional dynamic features, semi-continuous hidden Markov models, and speaker clustering. On the June 1990 (RM2) evaluation test set, the error rates of our current system are 4.3% with the word-pair grammar and 19.9% with no grammar.

Similar resources

Improved lexicon modeling for continuous speech recognition

We propose a stochastic lexicon model that represents pronunciation variations in a way suited to a continuous speech recognizer. In this lexicon model, the baseforms of words are represented by subword states and a probability distribution over subwords, in the form of a hidden Markov model. The proposed approach can also be applied to systems employing non-linguistic recognition units, and the lexicon is aut...

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is still a considerable gap between their accuracy and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Recognition of Speech with Non-random Attributes

Most current speech recognition systems are based on hidden Markov models, which assume that speech features are a sequence of stationary stochastic processes. However, certain speech attributes, such as background noise type or speaker voice color, do not have a stochastic character. This fact is often ignored by designers of robust speaker-independent recognition systems. In this wor...


Publication date: 1990